3,962 research outputs found

    A goal-oriented verification-based approach for target text line extraction from a document image captured by a pen scanner

    We present a goal-oriented, verification-based approach for extracting the target text line from a document image captured by a pen scanner. Given a binary image, a series of processing steps is invoked adaptively, each guided by the text line verification result of the preceding step. Each step adopts the strategy that is most effective for the problem at hand. Consequently, the target text line can be extracted more efficiently and reliably, depending on the nature of the captured image. The effectiveness of the approach is confirmed by a benchmark test.

    A study of prior sensitivity for Bayesian predictive classification-based robust speech recognition

    We previously introduced a new Bayesian predictive classification (BPC) approach to robust speech recognition and showed that BPC is capable of coping with many types of distortions. We also learned that the efficacy of the BPC algorithm is influenced by the appropriateness of the prior distribution for the mismatch being compensated. If the prior distribution fails to characterize the variability reflected in the model parameters, then BPC will not help much. We show how knowledge and/or experience of the interaction between the speech signal and the possible mismatch can guide us to a better prior distribution that improves the performance of the BPC approach.

    A Data Structure Using Hashing and Tries For Efficient Chinese Lexical Access

    A lexicon is needed in many applications. In the past, different structures such as tries, hash tables and their variants have been investigated for lexicon organization and lexical access. In this paper, we propose a new data structure that combines a hash table with tries for storing a Chinese lexicon. The data structure facilitates efficient lexical access yet requires less memory than a trie lexicon. Experiments are conducted to evaluate its performance for in-vocabulary lexical access, out-of-vocabulary word rejection, and substring matching. The effectiveness of the proposed approach is confirmed.
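
    The abstract does not describe the layout of the proposed structure, so the following is only a minimal sketch of one way a hash table and tries could be combined: a hash table keyed on the first character, with each entry pointing to a small trie over the remaining characters. The class names `TrieNode` and `HashTrieLexicon` and the method names are hypothetical, and this is not claimed to be the authors' design. It illustrates the three operations the abstract evaluates: in-vocabulary lookup, out-of-vocabulary rejection, and substring matching.

```python
class TrieNode:
    """One node of the per-character trie (hypothetical helper)."""
    __slots__ = ("children", "is_word")

    def __init__(self):
        self.children = {}   # next character -> TrieNode
        self.is_word = False  # True if the path to this node spells a word


class HashTrieLexicon:
    """Assumed layout: hash table on the first character, trie on the rest."""

    def __init__(self):
        self.buckets = {}  # first character -> root TrieNode for the remainder

    def add(self, word):
        if not word:
            return
        node = self.buckets.setdefault(word[0], TrieNode())
        for ch in word[1:]:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def contains(self, word):
        """In-vocabulary lookup; returns False for out-of-vocabulary words."""
        if not word:
            return False
        node = self.buckets.get(word[0])
        for ch in word[1:]:
            if node is None:
                return False
            node = node.children.get(ch)
        return node is not None and node.is_word

    def prefixes(self, text, start=0):
        """Substring matching: all lexicon words that begin at text[start]."""
        found = []
        if start >= len(text):
            return found
        node = self.buckets.get(text[start])
        end = start + 1
        while node is not None:
            if node.is_word:
                found.append(text[start:end])
            if end >= len(text):
                break
            node = node.children.get(text[end])
            end += 1
        return found


lexicon = HashTrieLexicon()
for entry in ["中国", "中国人", "人民"]:
    lexicon.add(entry)
```

    With this toy lexicon, `lexicon.contains("中国")` succeeds, `lexicon.contains("美国")` is rejected, and `lexicon.prefixes("中国人民", 0)` lists every word starting at the first character.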

    On-line Bayes adaptation of SCHMM parameters for speech recognition

    On-line adaptation of the semi-continuous (or tied mixture) hidden Markov model (SCHMM) is studied. A theoretical formulation of the segmental quasi-Bayes learning of the mixture coefficients in SCHMM for speech recognition is presented. The practical issues related to the use of this algorithm for on-line speaker adaptation are addressed. A pragmatic on-line adaptation approach that combines the long-term adaptation of the mixture coefficients with the short-term adaptation of the mean vectors of the Gaussian mixture components is also proposed. The viability of these techniques is confirmed in a series of comparative experiments using a 26-word English alphabet vocabulary.

    On-line adaptive learning of the correlated continuous density hidden Markov models for speech recognition

    We extend our previously proposed quasi-Bayes adaptive learning framework to cope with the correlated continuous density hidden Markov models (HMMs) with Gaussian mixture state observation densities, in which all mean vectors are assumed to be correlated and have a joint prior distribution. A successive approximation algorithm is proposed to implement the updating of the correlated mean vectors. As an example, by applying the method to an on-line speaker adaptation application, the algorithm is experimentally shown to be asymptotically convergent and to enhance the efficiency and the effectiveness of Bayes learning by taking into account the correlation information between different model parameters. The technique can be used to cope with the time-varying nature of some acoustic and environmental variabilities, including mismatches caused by changing speakers, channels, transducers, environments, and so on.

    Online adaptive learning of continuous-density hidden Markov models based on multiple-stream prior evolution and posterior pooling

    We introduce a new adaptive Bayesian learning framework, called multiple-stream prior evolution and posterior pooling, for online adaptation of the continuous density hidden Markov model (CDHMM) parameters. Among the three architectures we proposed for this framework, we study in detail a specific two-stream system in which linear transformations are applied to the mean vectors of the CDHMMs to control the evolution of their prior distribution. This new stream of prior distribution can be combined with another stream of prior distribution evolved without any constraints applied. In a series of speaker adaptation experiments on the task of continuous Mandarin speech recognition, we show that the new adaptation algorithm achieves fast-adaptation performance similar to that of incremental maximum likelihood linear regression (MLLR) when only a small amount of adaptation data is available, while maintaining the good asymptotic convergence property of our previously proposed quasi-Bayes adaptation algorithms.

    A Study On the Use of 8-Directional Features For Online Handwritten Chinese Character Recognition


    On-line adaptive learning of the continuous density hidden Markov model based on approximate recursive Bayes estimate

    We present a framework of quasi-Bayes (QB) learning of the parameters of the continuous density hidden Markov model (CDHMM) with Gaussian mixture state observation densities. The QB formulation is based on the theory of recursive Bayesian inference. The QB algorithm is designed to incrementally update the hyperparameters of the approximate posterior distribution and the CDHMM parameters simultaneously. By further introducing a simple forgetting mechanism to adjust the contribution of previously observed sample utterances, the algorithm is adaptive in nature and capable of performing online adaptive learning using only the current sample utterance. It can thus be used to cope with the time-varying nature of some acoustic and environmental variabilities, including mismatches caused by changing speakers, channels, and transducers. As an example, the QB learning framework is applied to on-line speaker adaptation, and its viability is confirmed in a series of comparative experiments using a 26-letter English alphabet vocabulary.
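
    The idea of recursively updating hyperparameters with a forgetting mechanism can be illustrated on a much simpler problem than the full CDHMM case. The sketch below is a simplification and not the algorithm of the paper: it assumes a Dirichlet prior over the mixture weights only, discounts the old hyperparameters by a hypothetical forgetting factor `rho`, and adds the soft counts from the current utterance. The function names `qb_update` and `weight_estimate`, and the use of the posterior mean as the point estimate, are assumptions made for illustration.

```python
def qb_update(alpha, counts, rho=0.9):
    """One recursive step: discount old hyperparameters, add new soft counts.

    alpha  -- current Dirichlet hyperparameters, one per mixture component
    counts -- soft occupancy counts from the current utterance only
    rho    -- assumed forgetting factor in (0, 1]; rho = 1 means no forgetting
    """
    return [rho * a + c for a, c in zip(alpha, counts)]


def weight_estimate(alpha):
    """Point estimate of the mixture weights (posterior mean of the Dirichlet)."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Start from a flat prior and process utterances one at a time;
# only the hyperparameters are carried over between utterances.
alpha = [1.0, 1.0]
for utterance_counts in ([3.0, 1.0], [0.5, 2.5]):
    alpha = qb_update(alpha, utterance_counts, rho=0.5)
weights = weight_estimate(alpha)
```

    A smaller `rho` makes earlier utterances fade faster, which is the kind of trade-off a forgetting mechanism controls when the speaker or channel changes over time.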
